ABSTRACT

This document describes the construction,
implementation, and implication of selected high inference measures
applied in a study of teacher effectiveness in the third, fourth, and
fifth grades. Selected independent variables served as hypotheses 
regarding which behaviors are likely to occur during concept 
instruction and which are likely to be relevant to student concept
learning. Two basic assumptions guided the selection of relevant 
behaviors: (1) Teacher behavior should be examined in terms of 
intent. Intent may be derived from instructional objectives. (2)
Relevant process variables should be derived from existing
theoretical or empirical bases that provide support for expecting 
certain relationships between instructional behavior and student 
outcomes. For this investigation, a record of classroom communication
between teacher and students was made on audio-tape recordings. 
Analysis of classroom interaction between teacher and pupils included 
evaluation of how accurate and complete was the teacher's knowledge 
of the subject and how effective was the teacher in conveying 
concepts to the pupils. Teaching techniques were analyzed in the 
light of resulting student understanding and achievement. (JD)
SELECTED CORRELATES OF EFFECTIVE TEACHER
BEHAVIOR DURING CONCEPT INSTRUCTION: 
THEIR DESIGN, UTILITY AND LIMITATIONS 

Researchers and teachers have long been interested in questions surrounding 
the nature of effective teaching. The traditional teacher effectiveness re- 
search paradigm sought correlates between teacher personality, experiential, and/ 
or aptitude variables and criterion variables of student ratings or student achieve- 
ment. The results of such studies have yielded little useful knowledge (Gage, 1963). 

Recently, a more productive approach has been sought through research on the
nature of instructional environments. One way of defining a learning environment 
is in terms of the behavioral characteristics of its participants. The reasoning 
underlying this emphasis is that the dominant features of an environment depend 
upon the typical characteristics of its members and that certain environments tend 
to reinforce or to extinguish specific behaviors. It is assumed that instructional 
environments differ in the particular behaviors they reinforce and thus tend to 
produce differential effects in terms of the nature, quantity, and quality of stu- 
dent outcomes. 

It may indeed be the case that optimal learning environments differ according
to the nature of the anticipated student outcome(s). That is, sub-environments
which are highly dissimilar may exist, even within one classroom. These varying
settings may serve to reinforce differing learning outcomes. It is conceivable,
for example, that a sub-environment which reinforces divergent, creative thought
processes may not promote the learning of specific facts and generalizations.

One of the tasks of investigators who attempt to identify optimal learning 
environments is the determination of relevant aspects of sub-environments which 
are likely to reinforce specific behavioral outcomes. In this connection, the need 
for a taxonomy of situations and learner outcomes is apparent. One might ask:



(a) What are the basic categories of school goals (such as concept and
generalization learning, the development of divergent thinking skills, the development of
problem creation and solution skills)? 

(b) Are there identifiable sub-environments which optimally promote such
learner outcomes? 

Needless to say, such a taxonomy does not exist. We continue to view school 
learning in terms of subject matter goals rather than in terms of general skills, 
abilities, and attitudes which are supported by the various academic disciplines. 
Most process-product research efforts are conducted in the context of describing 
teaching as it occurs within subject matter parameters, with little attention being
given to the types of learning or student achievement being promoted or even to
the teacher's intent, as reflected in course or instructional objectives.

Gage (1963) and Rosenshine and Furst (1971) have urged teacher behavior in- 
vestigators to conduct studies such as those being suggested here, in which speci- 
fically defined aspects of teacher behavior are examined. Such micro-effectiveness 
studies could examine teacher and student behaviors in terms of instructional
intent, as reflected through statements of objectives. It may be useful to concentrate
some process-product research efforts around categories of student achievement in 
order to investigate possible treatment by outcome relationships. 

One category of student outcome is that of concept learning. The study reported
here is one in a projected series of investigations aimed at identifying and vali- 
dating the characteristics of classroom sub-environments which promote optimum 
levels of concept learning. One long-term goal of this research is to test the 
generalizability of optimum concept learning sub-environments across types of learners 
and across subject matter. The purpose of this paper is to describe and critique 
the conceptualization and utility of the independent teacher process variables and 
the instruments employed to measure these aspects of teacher behavior. 
Identifying Relevant Teacher Variables

Given the lack of a theory or conceptual model of teaching from which to 
select behaviors, the task of identifying relevant teacher and student process 
variables is the first critical step in the effort to collect valid and reli- 
able data on the teaching act. Selected independent variables serve as hypotheses 
regarding which behaviors are likely to occur during concept instruction and 
which are likely to be relevant to student concept learning. Two basic assump- 
tions guided the selection of relevant behaviors. 

(1) Teacher behavior should be examined in terms of intent. Intent may be
derived from instructional objectives. 

(2) Relevant process variables should be derived from existing theoretical
or empirical bases which provide support for expecting certain relationships
between instructional behavior and student outcomes.

The variables in this study were derived from previous process-product investigations 
and from experimental studies of concept learning. The focus was on the verbal 
cognitive aspect of the teacher's task rather than on all possible dimensions. 

The generation of particular teacher process variables was facilitated by 
asking the question: What are the characteristics of a concept instructional event
which relate logically to clear, effective instruction? This question was answered 
as follows. In preparation for a concept instructional sequence, a teacher must 
respond to at least three pragmatic concerns. 

(1) What particular knowledge is needed to achieve the instructional objectives? 
This question refers to the substantive aspect of instruction. 

(2) What terminology ought to be employed to transmit meaningful ideas most 
effectively to learners? This question refers to the semantic aspect of instruction.

(3) What particular logical, procedural moves ought to be made during the 
lesson to meet the instructional objectives most effectively? This question refers 
to the strategic aspect of instruction. 

Substantive, semantic, and strategic components of instruction served as major 
categories for the generation of teacher antecedent and process variables. Follow- 
ing is a brief explanation of each instructional component along with the names of 
the variables employed in each category.

Substantive variables. The substantive aspect of an instructional event
refers to the body of knowledge explicitly made available to students during the 
lesson. In a primarily discussion-mode lesson, much knowledge is made available 
through teacher discourse or explanation. Scriven (1959) suggests the three 
criteria of accuracy, adequacy, and relevance for satisfactory explanations. 
These three criteria can aid in the identification of relevant antecedent as well 
as process variables. Three antecedent questions are: 

(1) How accurate is the teacher's knowledge (of the relevant subject)? 

(2) How adequate or complete is the teacher's knowledge of the subject? and 

(3) Is the knowledge which the teacher is able to generate relevant to the
knowledge demanded by the specific instructional objectives? 

Five variables related to the substantive aspect of instruction were employed 
in this study: 

Accuracy
(1) The concept definitions given or developed by the teacher are accurate.
(2) The concept examples given or accepted by the teacher accurately
represent the concept.

Adequacy
(1) The teacher explicitly states the necessary knowledge components
(as implied by the instructional objectives).
(2) The teacher explicitly states the necessary concept labels or
names (as implied by the instructional objectives).

Relevance
(1) The teacher's verbal behavior is appropriate to the achievement of
the instructional objectives for the lesson.

Substantiation for the accuracy, adequacy, and relevance variables can be
found both in previous process-product investigations and in concept learning
studies. Positive relationships have been found between the variable "opportunity
to learn the criterion material" and student performance. The "opportunity to 
learn" variable is similar to the adequacy and relevance variables employed in 
this study. Rosenshine (1972), Shutes (1969), and Husen (1967) found significant
positive correlations between measures of opportunity to learn and student achieve- 
ment. In both the Rosenshine and Shutes studies, actual tape-scripts of lessons 
were assessed to determine the extent of content coverage. In Husen's study, 
teachers rated whether their students had the opportunity to learn the type of 
problem(s) represented by the test items.

An important aspect of concept instruction is that the concept examples 
illustrate the critical dimensions of the concept. Experimental investigations 
on concept learning support the principle that as the critical properties of 
the concept become more obvious, ease of concept attainment increases (Clark, 1971). 
Inaccurate concept examples should, then, inhibit the efficient learning of
concepts. In addition, experimental concept learning studies have shown that
associating the critical properties and instances of concepts with the concept name or
label increases the ease of subsequent concept attainment. This aspect of concept 
instruction is reflected in the adequacy of concept label coverage variable. 

Semantic variables. The semantic aspect of an instructional event refers to
the teacher's ability to convey meaning through appropriate choices of terminology. 
For this study, the semantic component was expanded to include syntactics, which
deals with the rules governing word order. 

As children progressively become able to perform formal operational thought 
processes, verbal language becomes increasingly more important as the medium of 
instruction. For children in grades three through five, verbal language itself 
is a major component of instruction. Semantic variables ought to play a critical 
role in the assessment of effective communication in instruction. 

Three semantic variables were examined in this study. 

(1) The teacher employs a balance of concrete and abstract terminology. 

(2) The teacher speaks in complete, rather than incomplete, choppy sentences. 

(3) The teacher uses pronouns which clearly refer to their antecedents. 

Certain semantic abilities of the teacher could be identified as antecedent 
predictor variables to be examined in future studies. Abilities which logically
relate to verbal performance during concept instruction might include such measures 
as verbal fluency and divergent production of classes. 

Strategic variables. Instructional strategy refers to the total set of 
movements, or operations, performed by the teacher to achieve the instructional 
objectives. A strategy is comprised of smaller elements or purposive moves. A 
purposive move refers to an activity aimed at progressing the lesson from one
substantive point to another. An utterance is a verbal expression performed by one
person at a given time. An utterance may contain a single purposive move, or may
contain several purposive moves. The specific purposive moves identified for this 
study were derived from instructional variables found to relate to concept attain- 
ment and from results of process-product investigations. The purposive strategic 
moves and the anticipated direction of their relationship with student achievement 
were: 
(1) The teacher gives a concept definition. (positive)

(2) The teacher asks students to give a concept definition. (positive)

(3) The teacher gives a positive or negative concept example. (positive)

(4) The teacher asks students to give positive or negative concept examples.
(positive)

(5) The teacher reviews and summarizes the main ideas in the lesson. (positive)

(6) The teacher asks a low order question. A low order question prompts
students to engage in recall or translation as cognitive processes.
(positive)

(7) The teacher asks a high order question. A high order question requests
students to engage in cognitive processes of comparison/contrast, analysis,
application, or evaluation. (null)

(8) The teacher changes or shifts the topic of the lesson. (null)

(a) The teacher signals a shift in the topic. (positive)

(b) The teacher employs a summary-signal-shift pattern. (positive)

(c) The teacher shifts the topic while asking a low order question.
(negative)

(d) The teacher shifts the topic while asking a high order question.
(negative)

(9) The teacher asks a pair of questions in a series, not allowing time for
student response. (negative)

(10) The teacher answers his/her own or a student's question by explaining.
(positive)

(11) The teacher repeats his/her own question, following a student's response.
(negative)

(12) The teacher rephrases his/her own question, following a student's response.
(negative)

(13) The teacher tells students to stop irrelevant behavior; or, the teacher
engages in irrelevant behavior. (negative)

(14) Other behavior. Teacher's utterances which contained none of the above
purposive moves were coded in this category.

Two additional strategic variables were added on the strength of results of several
process-product studies.

(1) The teacher expresses enthusiasm and interest in the content of the lesson.

(2) The teacher displays an on-task approach toward the classroom atmosphere
and its interactions.

Designing the Observation Instruments 

Two major phases are apparent in the process of measuring classroom behavior:
(1) securing a record of a sample of the behaviors to be measured; and (2) quanti- 
fying the record (Medley and Mitzel, 1963). 

For this investigation, a record of classroom communication between teacher 
and students was made on audio-tape recordings. Twenty-two teachers of students 
in grades 3, 4, and 5 were instructed to conduct two concept lessons of forty-five
minutes each on the economic concept specialization. Teachers were provided with
background knowledge on the concept and with a set of instructional objectives
for the two lessons. Fifteen children were randomly identified from within the
intact class to which the teacher was assigned; this group became the teacher's
instructional class. Two full days prior to instruction, students were pre-tested
on a criterion-referenced measure matched to the instructional objectives; this
test was again administered following the second lesson. The class mean residual
gain score was the statistical unit of analysis representing teacher effectiveness
of concept instruction.
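
To make the class mean residual gain score concrete, the following is a minimal sketch. It assumes (the paper does not say how the residuals were obtained) that post-test scores are regressed on pre-test scores across all students by ordinary least squares and that each student's residual is then averaged within the class. The function names and the record layout are hypothetical.

```python
from statistics import mean


def residual_gains(pre, post):
    """Return post-test residuals after a least-squares fit of post on pre."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    if sxx == 0:
        raise ValueError("pre-test scores have no variance")
    slope = sum((x - mx) * (y - my) for x, y in zip(pre, post)) / sxx
    intercept = my - slope * mx
    # Residual = observed post-test score minus the score predicted from the pre-test.
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


def class_mean_residual_gain(records):
    """records: list of (class_id, pre_score, post_score), one tuple per student."""
    pre = [r[1] for r in records]
    post = [r[2] for r in records]
    residuals = residual_gains(pre, post)
    by_class = {}
    for (class_id, _, _), res in zip(records, residuals):
        by_class.setdefault(class_id, []).append(res)
    # One score per class: the mean of its students' residual gains.
    return {class_id: mean(vals) for class_id, vals in by_class.items()}
```

A class whose students score above what their pre-test scores predict receives a positive value; a class scoring below prediction receives a negative one.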

Despite the fact that the identification of relevant classroom behaviors has
been based on theory or on empirical evidence, there is no assurance that the
quantification system developed to describe the behaviors will be valid and reliable.
If an investigator is to develop a new quantification system, a number of major
decisions must be made. Each decision point serves as a source of invalidity. 
The actual quantification system, at best, should be viewed as hypotheses that 
certain definitions for variables and certain ways of recording behaviors are re- 
lated to student outcome measures. The investigator must make decisions about 
the recording procedure, item content, coding format, and the unit of analysis 
for each observational system developed (Borich, 1977). These characteristics
of observational instruments will be discussed below in the context of the three
instruments developed for this study: the Observational System for Concept In- 
struction (OSCI), the Tally Form and the Rating Form. 

Recording Procedure. Sign or category procedures are used to record the 
frequency of the behavior under consideration. If an event is recorded only once, 
regardless of its actual frequency of occurrence, the recording instrument is 
called a sign system. If an event is recorded each time it occurs, the recording 
instrument is called a category system. A rating system is usually viewed as a 
modified sign system, wherein the observer makes an estimate of the frequency of
an event, usually at the end of an observational session. 
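
The difference between the two recording procedures can be sketched in a few lines. This is only an illustration: the behavior labels and function names are hypothetical and do not come from the instruments used in the study.

```python
from collections import Counter


def category_record(events):
    """Category system: tally every occurrence of each behavior."""
    return dict(Counter(events))


def sign_record(events):
    """Sign system: mark each behavior once, however often it occurred."""
    return {behavior: 1 for behavior in set(events)}


# Hypothetical sequence of observed behaviors in one session.
events = ["definition", "example", "example", "low_order_question"]
category_counts = category_record(events)  # "example" is tallied twice
sign_marks = sign_record(events)           # "example" is marked once
```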

Two category systems and a modified category rating system were developed for
this study. The Observational System for Concept Instruction (OSCI) was designed 
to record the sequence and frequency of the teacher (and student) strategic variables 
(see Figure 1). The procedure for recording behavior with the OSCI
played no role in the quantification process; the quantification occurred after the
data were gathered and frequencies were summed across the categories. 

The Tally Form (see Figure 2) was developed for the quantification of the
adequacy of the teacher's substantive presentation variables. From the set of
instructional objectives, the investigator derived fifteen knowledge components, or
generalizations, which encompassed the essential information required to know the meaning
of the concept and to fulfill the tasks implied by the objectives (see Figure 3).
These generalizations defined the adequacy of knowledge coverage variable. A list
of nine essential concept labels was also derived from the instructional objectives 
to define the adequacy of concept label coverage variable. 

The trained observer listened to the audio-tape recorded lessons for verbal 
indications of the teacher's explicit inclusion of each of the knowledge components 
and of each of the concept labels. The essence of the meaning of each of the know- 
ledge generalizations was sufficient; it was not necessary for the teacher's ter- 
minology to be exactly that of the listed generalizations. A tally was made on 
the adequacy Tally Form by the observer for each time the teacher actually stated 
each of the knowledge components and each of the concept labels. Thus, the fre- 
quency of occurrence of each generalization could be computed for each lesson 
separately or for the combined lessons. 

A modified category-rating system was also developed (see Figure 4) to
quantify the accuracy and relevance variables, the three semantic variables, and the
enthusiasm and class control variables. A seven-step graphic scale was employed
with the anchors specifying the quality of performance, ranging from (1) very poor 
performance to (7) very good, outstanding performance on each particular variable. 

The observers were trained to listen to the audio-tape recorded lessons for 
specific examples of teacher behavior which would be indicative of the
characteristic underlying each variable. For example, two variables were concerned with
the accuracy of concept-specific substantive behavior: accuracy of concept
definitions and accuracy of concept examples. As the observer heard the teacher
provide either a concept definition or an example, the observer made an assessment 
of the degree of accuracy of the teacher's statement. The observer made ratings 
along the continuum for each variable to indicate judgments made during the
audio-tape critique. At the end of the lesson, a summary rating, or an average of the
ratings made for specific behavioral instances, was made.

The rating scale was constructed in this manner in an attempt to increase the 
level of objectivity of the ratings by requiring the observer to focus on specific 
units of behavior. However, the attempt to [...] actual frequency and quality of
behavior proved problematic. The observer was faced with the doubly-difficult
task of (1) judging whether or not a particular kind of behavior occurred and
determining what kind of event it was (a qualitative judgment); and of (2) assessing
the degree to which a particular quality was present in the behavioral sample
(a quantitative judgment). This task was especially difficult with the semantic
variables where it was frequently difficult to find discrete, easily identifiable 
examples of behavior which related to the variables as defined. The observer's 
task was considerably less difficult with the accuracy and relevance variables; 
the specific units of behavior (the concept definition, concept example, and
teacher utterances) were overt, discrete, and easily distinguishable. However,
the difficulty of specifying behaviors representing the different levels of quality 
implied in each variable contributed to observer bias. 

Item content. Item content specifies the level of inference demanded from the
data and from the observers. Rosenshine distinguishes between low and high infer- 
ence responses. Low inference responses or variables tap the directly observable, 
specific, explicit phenomena of the environment. High inference responses or vari- 
ables ask the observer to make a wholistic, global judgment about the meaning of 
what is occurring. Low inference measures are commonly thought to maximize the 
objectivity of the data; that is, the more molecular the variable, the more objective 
the measurement can be. The recording of a sequence of interaction can be best
captured by employing a low inference system. High inference variables are usually
used to assess general teacher characteristics not easily measured by discrete
behaviors.

At one time, rating forms were practically defined as requiring high inference
judgments whereas category systems implied the use of low inference behaviors. This
is no longer the case. Items requiring a low level of inference can be found on
rating forms while items on sign or category systems may demand a high or moderate
level of inference (Rosenshine, 1973). The critical dimension regarding the level
of inference demanded deals with how directly observable the relevant behavior is.
How much judgment must an observer undertake in order to code a particular behavior?

Assessing the three instruments employed in this study in terms of item
content, one finds a range from high inference (enthusiasm) to moderate inference
(adequacy of knowledge coverage variable) to low inference (off-task talk) variables.

Many of the OSCI items demanded at least a moderate level of inference from
the observer. The observer had to be knowledgeable enough regarding the
instructional content to be able to discern a positive or negative concept example or a
concept definition when provided by the teacher. In very few cases would the 
teacher signal that a particular statement served the purpose of a definition, an 
example, or a review. Rather, the observer's task was to infer the intent of the 
teacher's verbal behavior as specified by the categories of the observational sys- 
tem. With OSCI, the observer must be engaged in a content as well as a process ana- 
lysis simultaneously. Several category decisions are a function of previously 
occurring content and processes; thus, considerable information must be stored by 
the coder who successfully employs the OSCI. On the average, 2 1/2 to 3 hours were
necessary for the coding of each 45 minute lesson. This indicates at least a
moderate level of difficulty and certainly more than low inference judgments.

The adequacy of knowledge coverage variable on the Tally Form also demanded
at least a moderate level of inference from the observer. While the observer was
looking for discrete, overt, concrete behaviors, the appropriate teacher
utterances were not consistently easily observable. The observer's task was to assess
the meaning of the teacher's explicit statements containing a knowledge component 
and to match that meaning to one (or none) of the generalizations stated on the Tally 
Form. The observer's task on the adequacy of concept label coverage variable, by 
contrast, was low inference. The teacher either did or did not say the name of 
the concept. 

The variables on the Rating Form demanded a moderate to high level of
inference from the observers. The accuracy and relevance variables involved comparing
the teacher's behavior with a standard body of knowledge. This meant that the 
observer had to be a subject matter expert in order to accurately assess these 
variables. The enthusiasm and on-task variables demanded high inference judgments;
attempting to identify specific examples of behaviors illustrative of these two 
variables to count during the lesson proved to be difficult. Revision of the Rating 
Form should include renaming the enthusiasm and on-task variables possibly as 
paired rating scales (stimulating vs. dull; alert vs. apathetic; businesslike, 
task-oriented vs. laissez faire) to be assessed once at the end of an observation 
or as ratings which are made every five minutes (or so) during the observation. 

Coding format. A single coding format records a behavior on one dimension
(Borich, 1977) while, with a multiple coding format, a behavior is coded according
to any number of dimensions (Rosenshine, 1973). The OSCI has a type of multiple
coding format: behaviors are subdivided into (a) type of speaker (teacher or
student), (b) type of communication (question asking or information giving), and
(c) relationship of communication to task (on-task vs. off-task talk). Behaviors
are coded only once, however, and are recorded as they occur sequentially.

Unit of analysis. The unit of teacher behavior which is coded on OSCI is the
purposive move. A purposive move refers to an activity performed by the teacher
which has the apparent function or effect of progressing the lesson from one
substantive or process point to another. Each purposive move is a statement or
question which expresses a more or less complete idea and which serves a single
function, as defined by the categories of the observational system. Every change
of purposive move or speaker necessitates a new coding entry. (See Figures 5
and 6 for examples of coded teacher and student verbal behavior using the OSCI.)
A purposive move should be distinguished from an utterance; an utterance is a
verbal expression performed by one person at a given time. An utterance may
contain one or several purposive moves. OSCI is able to record the frequency
and sequence of purposive moves, but not the duration of each move. To the
extent that time spent on particular purposive moves influences student learning,
the absence of a duration-weighting mechanism on the OSCI is a source of
distortion.
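
A coded lesson of this kind can be pictured as an ordered list of purposive moves, each tagged on the three OSCI dimensions. The sketch below is illustrative only: the field names and category labels are hypothetical and are not the actual OSCI codes. It shows that frequency and sequence can be recovered from such a record while duration cannot, because no time information is stored.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedMove:
    speaker: str    # "teacher" or "student"
    mode: str       # "question" or "information"
    on_task: bool   # on-task vs. off-task talk
    category: str   # hypothetical label for the purposive move


def move_frequencies(moves):
    """Frequency of each purposive move category across the lesson."""
    return Counter(m.category for m in moves)


def move_transitions(moves):
    """Sequence information: counts of each adjacent pair of categories."""
    return Counter((a.category, b.category) for a, b in zip(moves, moves[1:]))


# Hypothetical fragment of a coded lesson, in order of occurrence.
lesson = [
    CodedMove("teacher", "information", True, "definition"),
    CodedMove("teacher", "question", True, "low_order_question"),
    CodedMove("student", "information", True, "student_response"),
    CodedMove("teacher", "information", True, "definition"),
]
frequencies = move_frequencies(lesson)
transitions = move_transitions(lesson)
```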

Various units of behavior were necessary for the variables on the Rating 
Form. The observers attempted to assess definitions given or accepted, examples 
given or accepted, and teacher utterances for the accuracy and relevance variables. 

Reliability of Observation Instruments

The accuracy of any observational system is partially a function of (a) the
consistency of observations among those judging the behavior and (b) the test-retest
reliability or the stability of teacher behavior measured across changes in pupils, 
content, and/or time. Following is a discussion of these two aspects of reliability 
as they relate to the three instruments employed in this study. 

Rater consistency. The investigator and a trained assistant were the
observers for this study. The training program consisted of four parts: (1) gaining
familiarity with the substantive aspect of the concept specialization; (2) learning
the definitions and distinguishing characteristics of the teacher (and student)
process variables; (3) practicing coding and rating the process behaviors using
pilot test data; and (4) establishing inter-coder agreements. Three eight-hour
training sessions were held before the criterion reliability level of 0.80 of
observer agreement was achieved for each of the three instruments.

Coder agreement data were gathered by having the two observers independently
critique the same audio-taped lesson using the OSCI. A second lesson was critiqued
independently by each observer using the Rating and Tally Forms. Coefficients of
observer agreement on OSCI were calculated by using the formula proposed by Scott
(Flanders, 1965). Scott's coefficient, pi (π), is determined by the formula

$$\pi = \frac{P_o - P_e}{1 - P_e}$$

where $P_o$ is the proportion of agreement and $P_e$ is the proportion of agreement
expected by chance, which is found by squaring the proportion of tallies in each
category and summing these over all categories.
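
The calculation of Scott's coefficient can be sketched as follows. This is a minimal illustration, assuming that the two observers' codes are aligned unit by unit (one code per purposive move from each observer) and that the chance proportions are taken from the pooled tallies of both observers; the function name and category labels are hypothetical.

```python
from collections import Counter


def scotts_pi(codes_a, codes_b):
    """Scott's pi for two observers' codes of the same sequence of units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of units")
    n = len(codes_a)
    # Po: proportion of units on which the two observers agree.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Pe: sum over categories of the squared pooled proportion of tallies.
    pooled = Counter(codes_a) + Counter(codes_b)
    p_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if p_e == 1.0:
        raise ValueError("only one category was used; pi is undefined")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes from two observers for four purposive moves.
observer_one = ["definition", "definition", "example", "example"]
observer_two = ["definition", "definition", "example", "definition"]
agreement = scotts_pi(observer_one, observer_two)
```

In this small example the observers agree on three of four units, so Po = 0.75; the pooled proportions are 5/8 and 3/8, so Pe = 34/64, and the coefficient is 7/15, or about 0.47.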

Levels of agreement between observer one and observer two at the end of the
training period were π = 0.92 for each of two independently coded audio-taped
lessons of forty-five minutes each. At a mid-point in the data coding period,
a second coefficient of observer agreement was calculated, utilizing one of the
originally coded audio-tapes. Scott's coefficient of agreement was π = 0.90.

Consistency checks over time were also computed, comparing each observer's
degree of agreement with self on the OSCI and the Tally Form. For observer one,
π = 0.83; for observer two, π = 0.86. Rating Forms were marked almost
identically by the two observers on all consistency checks.

Observers were blind to the criterion variable data during the entire coding
period. The classroom process data were contained on twenty-two audio-tapes.
The tapes were stratified along grade levels represented and then randomly
divided into two sets. Each of the observers was assigned a set of eleven tapes to
which the OSCI was applied. The observers then exchanged sets of tapes
and applied the Tally and Rating Forms to this new set. This procedure was
employed to achieve independence of observations between the OSCI and the other
two measures.
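
The tape assignment described above can be sketched as a stratified random split followed by an exchange of sets. The sketch is an assumption about mechanics that the paper does not spell out: tapes are shuffled within each grade level and dealt alternately to the smaller set, and the tape identifiers and the number of tapes per grade are hypothetical.

```python
import random


def stratified_split(tapes_by_grade, seed=0):
    """Split tapes into two sets, balancing each grade level across the sets."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    set_one, set_two = [], []
    for grade in sorted(tapes_by_grade):
        tapes = list(tapes_by_grade[grade])
        rng.shuffle(tapes)  # random order within the grade-level stratum
        for tape in tapes:
            # Give each tape to whichever set is currently smaller.
            target = set_one if len(set_one) <= len(set_two) else set_two
            target.append(tape)
    return set_one, set_two


def assign_instruments(set_one, set_two):
    """Each observer applies the OSCI to one set and the two forms to the other."""
    return {
        "observer_one": {"OSCI": set_one, "Tally and Rating Forms": set_two},
        "observer_two": {"OSCI": set_two, "Tally and Rating Forms": set_one},
    }


# Hypothetical distribution of the twenty-two tapes over grades 3, 4, and 5.
tapes = {
    3: [f"grade3_tape{i}" for i in range(7)],
    4: [f"grade4_tape{i}" for i in range(8)],
    5: [f"grade5_tape{i}" for i in range(7)],
}
first_set, second_set = stratified_split(tapes)
plan = assign_instruments(first_set, second_set)
```

Under this scheme no observer applies the OSCI and the two forms to the same tape, which is the independence property the procedure was meant to secure.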

With experienced classroom observers and with sufficient training, acceptable 
levels of rater agreement can usually be achieved. That was the case for this 
study. However, in planning a replication study, reliability can be improved 
upon by (a) increasing the number of observers, and (b) excluding the investigator 
from the observer pool. In addition, independence of observations can be increased 
by having observers apply only one observational system to each classroom sample.

Teacher stability. Is teacher behavior reasonably stable across content, time,
and pupils? Which aspects of behavior might be likely to be stable and which types 
of behavior might be expected to vary across various changes in setting? These 
remain unanswered questions. If teacher behavior varies widely across conditions, 
a separate index of teacher skill would have to be constructed for each situation 
in order to assess teacher effectiveness. Shavelson and Dempsey (1976) report 
equivocal findings in their review of the generalizability and stability of measures 
of teacher behavior. Lack of standardization of measures contributes to an inability 
to draw comparisons across studies. In general, however, the
global, high inference ratings of teacher behavior appear to be more stable than the
low inference, counted measures. Rosenshine (1970) reports moderate consistency
in teaching behavior when the same material is taught to different pupils. This 
generalization was summarized from a limited number of studies, however. While 
it appears that teacher behavior may be moderately consistent over brief periods 
of time, behavior is less stable over time and across changes in content. Borich 
(1977) suggests that we may not be tapping the kinds of behaviors which are rela- 
tively stable over time and/or presently used instruments may be confounding the 
data. 

For this study, teachers conducted two lessons of forty-five minutes each on
consecutive days to the same group of students. One set of instructional objectives
was used to guide the teacher's construction of both lessons. The correlations
for the low inference, counted variables recorded on OSCI and the Tally Form are
shown on Table 1.

Table 1 here 

One might expect some degree of consistency of strategic and substantive
behaviors given common content and students. However, given the results of
studies in which low inference variables showed little stability even across two
lessons, the correlations shown on Table 1 are surprising. A number of teacher
behaviors measured by OSCI remained fairly stable: the giving of concept
definitions (0.49), the giving of positive concept examples (0.48), signalling a change
in the topic (0.45), explaining (0.67), asking a low order question (0.70), asking
a high order question (0.48), signalling and changing the topic simultaneously
(0.43), and off-task behavior (0.95). Low frequencies of behavior on several of
the variables may have contributed to low stability coefficients. The means for
the adequacy of content coverage variable represented the average number of
knowledge components provided by the teachers in each lesson. Apparently, teachers
were very consistent in their provision of knowledge (0.54) and of concept labels
(0.69).
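
A stability coefficient of this kind is a correlation, across teachers, between the frequency of a behavior in the first lesson and its frequency in the second. The sketch below assumes a Pearson product moment correlation (the coefficient named in Tables 4 and 5); the function name `pearson_r` and the counts for six teachers are hypothetical and are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical counts of one coded behavior for six teachers.
lesson_one = [2, 0, 5, 3, 1, 4]
lesson_two = [1, 1, 4, 3, 0, 5]
stability = pearson_r(lesson_one, lesson_two)
```

The guard against a constant variable reflects the point made above about low frequencies: a behavior that hardly varies across teachers leaves little variance for a correlation to capture.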

One wonders about the generalizability of these concept-related behaviors
across time, type of student, type of concept and across the teaching of concepts 
from other disciplines. It is hoped that similar investigations can be conducted 
to examine these relevant variables. 

The stability coefficients for the rated teacher variables are shown on
Table 2.

Table 2 here 

Not surprisingly, the coefficients are consistently high. The question is whether
this degree of stability is an attribute of teacher behavior or an artifact of the
measurement procedure. The dimensions of teacher behavior assessed by the
variables on the Rating Form are not intended to be mutually exclusive. It is
conceivable that any given teacher could perform consistently well or poorly on each
of the variables. However, several problems with the Rating Form variables are
apparent; these conditions probably influenced the observers to make subjective
and impressionistic assessments.

The rater's task was to make assessments of behavioral indications of each
variable. However, relevant information available to the rater varied from one
variable to another and from one teacher to another. This variability may have
encouraged the rater to be influenced by other, unknown characteristics of the
teachers. Also, for most variables, the definitions proved to be inadequate for
the range of behaviors encountered.

As shown in Table 3, high intercorrelations are evident for all of the rated
variables.

Table 3 here 

This makes one cautious about calling each variable by a separate name. One
wonders about the intercorrelations of rated variables in other studies which have
reported high stability measures for rated behaviors.


For future investigations, confidence in the rated measures can be enhanced by
(a) specifically defining each variable, providing example behaviors at points
along the scale; (b) increasing the number of independent ratings; and (c)
designing high and low inference independently derived measures of the same
characteristic. This last point will enable the examination of the degree of correspondence
between sets of logically related behaviors.

Validity of Observation Instruments

Most teacher behavior investigations aim at establishing relationships
between measures of behavior and a criterion measure. When the measures are
collected at about the same time, the effort is one of establishing concurrent
validity. Predictive validity implies the ability of the behaviors to relate to
achievement over time. Tables 4 and 5 present the teacher behavior correlates
of student achievement on concept tasks administered immediately following
instruction.

Tables 4 and 5 about here 

(See Armento, Beverly, "Teacher Behaviors Related to Student Achievement on a
Social Science Concept Test," a paper presented at AERA, 1976, for a discussion
of these data.)

The validity issue surrounding teacher behavior studies is that of construct
validity, or the ability of observation systems to measure the teacher or student
behaviors they purport to measure. Borich (1977) proposes that observational
systems should be able to demonstrate convergent and discriminant validity. That
is, a particular behavior measured on one instrument should correlate
significantly with a similar or same behavior measured on another instrument. In
addition, that correlation should be "higher than either that between dissimilar
behaviors on the same instrument or that between dissimilar behaviors measured
by different observation coding instruments" (Borich, 1977, p. 20).

In the study being reported, three instruments were employed. No obvious 
attempt was made to define the relevant variables along different types of scales; 
thus, minimal data exist to examine the convergent and divergent validity of 
the measures. However, an example of this procedure can be illustrated.

                                           Method A                   Method B
                                       A1           A2            B1            B2
Methods    T. behaviors              Gives       Off-task     Adequacy of     On-task
                                   definition    behavior     content         rating
                                                              coverage

   A       1  Gives definition       (.49)
           2  Off-task behavior       -.06        (.95)

   B       1  Adequacy of
              content coverage        .43         -.43          (.54)
           2  On-task rating          .32         -.71           .75          (.96)


In the above illustration, Method A is OSCI and Method B represents both
the Tally and Rating Forms. The teacher strategic behavior, gives concept
definition, can be viewed as similar to the adequacy of content coverage. While the
on-task rating should be strongly inversely related to the actual counting of
off-task behavior, one would not expect the on-task behavior to diverge from the A1 and
B1 variables. Rather, the on-task rating should be measuring behavior contained
in each of the A1 and B1 variables, and thus can be expected to be a positive correlate
of same.

The premises underlying convergent and discriminant validity are: (1) the
correlation between the same behavior measured by the same method (reliability)
should be higher than (2) the correlation between the same behavior measured
by two different methods, which in turn should be higher than (3) the correlation
between two different behaviors measured by the same method, which in turn
should be higher than (4) the correlation between two different behaviors measured
by two different methods (Borich, 1977). By using the premises, one can see
that relatively good convergent and discriminant validity is indicated for the
behaviors, giving definitions, covering content, and off-task behavior. The
on-task behavior variable behaves as expected.
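
The ordering stated in these premises can be checked mechanically. The sketch below is one hypothetical way of doing so, not a procedure taken from Borich (1977): the function name `check_validity_ordering` is an assumption, and the comparison uses signed correlations, which is only one possible reading of "higher than." The values are those reported in the first illustration for gives definition (A1), off-task behavior (A2), and adequacy of content coverage (B1).

```python
def check_validity_ordering(reliability, convergent, same_method, different_method):
    """Compare the four kinds of correlation in the order the premises give."""
    return {
        "(1) reliability > (2) same behavior, two methods":
            reliability > convergent,
        "(2) same behavior, two methods > (3) two behaviors, same method":
            convergent > same_method,
        "(3) two behaviors, same method > (4) two behaviors, two methods":
            same_method > different_method,
    }


result = check_validity_ordering(
    reliability=0.49,        # gives definition with itself across lessons
    convergent=0.43,         # gives definition with adequacy of content coverage
    same_method=-0.06,       # gives definition with off-task behavior (both OSCI)
    different_method=-0.43,  # off-task behavior with adequacy of content coverage
)
for premise, holds in result.items():
    print(premise, "->", holds)
```

With signed values all three comparisons hold for this set of behaviors. A reading based on absolute magnitudes would treat the last comparison differently, so the choice of convention should be stated whenever such a check is reported.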
A second illustration can be examined: 

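                                           Method A                   Method B
                                       A1           A2            B1            B2
Methods    T. behaviors              Gives       Off-task     Adequacy of     On-task
                                    example      behavior     concept label   rating
                                                              coverage

   A       1  Gives example          (.48)
           2  Off-task behavior       -.42        (.95)

   B       1  Adequacy of concept
              label coverage          .47         -.30          (.69)
           2  On-task rating          .61         -.71           .61          (.96)
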



Again, the convergence of the gives example and adequacy of concept label coverage
variables supports the notion that these measures are assessing similar behaviors.
Both convergent and discriminant validity are relatively good for three behaviors,
with the on-task rating converging with the specific on-task behaviors.

The intercorrelations for the behaviors measured by OSCI and the Tally Form appear
to be internally consistent; that is, related and unrelated items correlate
as predicted. This cannot be said for the variables assessed by the Rating Form,
where all behaviors converge, and thus probably do not measure identifiable aspects
of teacher behavior.

Summary. Future examinations of concept instruction should include a
broader range of high and low inference measures which are designed to assess the
same or similar dimensions of teacher behavior. This provision will enable a more
thorough assessment of the construct validity of the instruments. In addition,
the semantic variables, in particular, need to be reconceptualized and a more
reliable measure developed for their assessment. Each of the variables
presently measured by the Rating Form is in need of refinement and redefinition.

The behaviors measured by OSCI and the Tally Form appear to be more
accurate assessments of the variables as defined. Several behaviors demonstrating
at least a moderate degree of stability also related significantly to
student achievement: the adequacy of content coverage and the adequacy of
concept label coverage as measured by the Tally Form; and the teacher gives
concept definitions and gives positive concept examples.

Substantive, semantic, and strategic teacher behaviors can be revised 
on the basis of this study. However, these basic categories of behavior 
continue to be viable; changes are apparently needed in the type of measurement
employed with a few of the variables. 

It is hoped that replication and extension studies will be conducted 
with the revised instruments to test the generalizability of the more promising
substantive and strategic behaviors across changes in pupils, time, and 
type and content of concept instruction. 

Figure 1

Observational System for Concept Instruction (OSCI)

Symbols

Teacher Behavior:

[symbol not legible] represents the teacher's giving of either
a concept definition or a concept example.

? represents the teacher's asking for either
a concept definition or a concept example.

p represents a positive concept example.

n represents a negative concept example.

Student Behavior:

In response to a teacher's low or high order
question, (C, L, T, I) represent the following:

C is a correct response.

L is a logical response.

T is a "true . . . but" response.

I is an incorrect response.

c represents a correct concept example or definition.
i represents an incorrect concept example or definition.

Figure 2

Tally Form (Short Form for Data Collection)

Knowledge Components

Specialization is the concentration, focusing on a small aspect of some whole
Specializing accentuates and creates differences
Division of labor implies role differentiation
Specialization occurs in at least three forms: technological, occupational, and
geographical
Aim of specialization is efficiency
Specialization allows people and regions to use to best advantage their
differences in skill, knowledge, interest, and resources
Specialization necessitates interdependence
Specialization necessitates trade
Specialization implies certain problems:
    need for interdependence
    possible loss of efficiency in one area
    low transfer of specialized skills
    possible boredom

Concept Labels

Specialization
Division of labor
Technological specialization
Occupational specialization
Geographical specialization
Interdependence
Dependence
Trade
Exchange


Figure 3

Generalizations Defining the Basic Knowledge
Implied by the Instructional Objectives

1. Specialization is the concentration or focusing upon
some small aspect of a defined whole.

2. The process of specializing creates and accentuates
differences.

3. Division of labor implies role differentiation.

4. Specialization occurs as the level of technology is
differentiated to replace human resources.

5. Specialization occurs as human roles are differentiated
in occupational endeavors.

6. Specialization occurs as geographical regions serve
differentiated functions.

7. The major aim of any of the three forms of specialization
is increased efficiency, or the production of
more from fewer resources.

8. Specialization allows people to use to best advantage
their differences in skill, knowledge, interest, and
resources.

9. Specialization allows people to use regional differences
in natural resources to best advantage.

10. Specialization necessitates interdependence.

11. Specialization necessitates exchange or trade.

12. Specialization implies the need for interdependence;
this can be a problem.

13. When one specializes in one aspect of production, it is
likely that one will lose efficiency in other areas
of production.

14. There is often a low degree of transfer of specialized
skills and capabilities from one aspect of production
to another.

15. Certain repetitive specialized tasks often bring the
problem of boredom.

Figure 4

Rating Form

Measurement of High Inference
Process Variables

Rating Code:

1 = very poor performance on this variable
2 = poor performance on this variable
3 = slightly below average on this variable
4 = average performance on this variable
5 = slightly above average on this variable
6 = good performance on this variable
7 = very good, outstanding performance on this variable

1. Accuracy of C[?] specific Teacher Behavior

a. Definitions [?] or developed by the teacher are
accurate.

b. Examples given or accepted by the teacher correctly
represent the concept.

1 2 3 4 5 6 7

Sub-total: Accuracy

2. Relevance of Teacher Behavior to Instructional Objectives

Teacher [?] are appropriate to the achievement
of the instructional objectives for the lesson.

1 2 3 4 5 6 7

Sub-total: Relevance

3. Teacher Language

a. The teacher uses a balance of abstract-concrete
words.

1 2 3 4 5 6 7

b. [?] utterances are minimal. Complete sentences
are utilized.

1 2 3 4 5 6 7

c. "[?] words" are minimal. The referent
is seldom in doubt; pronouns clearly refer to
antecedents.

1 2 3 4 5 6 7

Sub-total: Semantics

4. The teacher expresses enthusiasm and interest in the
content of the lesson.

1 2 3 4 5 6 7

Sub-total: Enthusiasm

5. The teacher displays an on-task approach.

1 2 3 4 5 6 7

Sub-total: On-task

TOTAL

Figure 5

Example of Coded Teacher Strategic Behavior
Using the Observational System for Concept Instruction

In this example of coded teacher verbal behavior,
the teacher reviews the major ideas already presented
in the lesson (#2), signals a change in the topic (#3), changes
the topic (represented by the circle in #3), gives a
concept definition (#4), gives three positive concept
examples [?], and then asks a [?] order question [?].
All of the above purposive moves occurred [?].


Figure 6

An Example of Coded Teacher-Student Process Behavior
Using the Observational System for Concept Instruction

The following teacher-student interaction is represented by
the coding grid in Figure 6.

T: Can [?] name an example of occupational [?]
(In 2, the teacher asks for a positive concept
example.)

S: My father works on an assembly line at [?]. He
specializes in installing tubes in color TV sets.
(In 3, the student gives a correct concept example.)

S: My big sister is studying to be a nurse. That's a
specialized kind of job. (In 4, a student gives a
correct concept example.)

T: Yes, you're both correct. Another example might be
occupational therapists. (In 5, the teacher gives
a positive concept example.)

S: I don't understand. What's that? (In 6, the
student expresses misunderstanding.)

Table 1

Stability of Teacher Low Inference Behaviors
across Two Social Science Concept Lessons

Teacher Low Inference                            First Lesson       Second Lesson     Stability
Process Variables                                [?]      [?]       [?]      [?]      Coefficient

Gives Concept Definition                         [?]      2.61      2.09     [?]        0.49*
Asks for Concept Definition                      [?]      [?]       3.00     [?]        0.33
Gives Positive Concept Example                   [?]      [?]       6.41     [?]        0.48*
Gives Negative Concept Example                   [?]      0.53      0.23     [?]       -0.19
Asks for Positive Concept Example                [?]      [?]       0.91     [?]        0.24
Asks for Negative Concept Example                         - Does Not Occur -
Signals a Topic Change                           [?]      4.48      3.64     [?]        0.45*
Reviews, Summarizes Main Idea                    [?]      3.14      5.00     [?]        0.14
Explains, Answers                                [?]      [?]       8.50     [?]        0.67**
Asks Low Order Question                          [?]     21.56     36.91    18.77       0.70**
Asks High Order Question                         [?]      6.44      7.27     7.5[?]     0.48*
Repeats Question After Student Response          [?]      2.81      2.18     2.36       0.25
Rephrases Question After Student Response        [?]      3.43      3.27     [?]        0.35
Signals and Changes the Topic Simultaneously     [?]      3.90      3.32     [?]        0.43*
Changes Topic With a Low Order Question          [?]      7.12      4.27     4.37       0.39
Changes Topic With a High Order Question         [?]      2.63      0.91     1.11      -0.01
Asks Pairs of Questions                          [?]      9.62      7.05     6.24       0.33
Tells Students to Stop Irrelevant Behavior       [?]     16.86      9.14    13.34       0.95**
Other                                            [?]      1.61      1.32     [?]       -0.03
Adequacy of Content Coverage                     [?]     10.56     17.55     [?]        0.54**
Adequacy of Concept Label Coverage               [?]      7.3[?]   10.64     [?]        0.69**

 * p ≤ .05
** p ≤ .01

Table 2

Stability of Teacher High Inference Behaviors
across Two Social Science Concept Lessons Taught on [?]

                                                  Stability
                                                  Coefficient

Accuracy of definitions                              [?]
Accuracy of examples                                 [?]
Relevance of behavior to objectives                  [?]
Balance concrete/abstract terminology                [?]**
Use of complete sentences                            [?]
Proper use of pronouns                               [?]**
Displays enthusiasm                                  [?]
Establishes control over learning situation          0.96**

 * p ≤ .05
** p ≤ .01

Table 3

Zero Order Correlation Matrix for Rated
Teacher Process Variables

1. Defines concept accurately
2. Provides accurate concept examples
3. Expresses behavior relevant to objectives
4. Achieves a balance between concrete and abstract terminology
5. Uses complete sentences and clear pronoun referents
6. Displays interest and enthusiasm over the content of the lesson
7. Displays primarily on-task, low noise behavior

.94
.87  .78
.34
.76
[?]
.89
.81  .77  .90
.92  .85
.83  .91
.75  .84
[?]  .86
.87

TABLE 4. PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTS
FOR LOW INFERENCE TEACHER PROCESS VARIABLES AND CLASS
RESIDUAL MEAN GAIN SCORES ([?])

                                                                   Level of
Teacher Process Variables                         Correlation     Significance

Gives concept definition                             .426            .02*
Asks for concept definition                          .190
Gives positive concept example                       .497            .009**
Gives negative concept example                       .225
Asks for positive concept example                    .177
Asks for negative concept example                    Does not occur
Signals a topic change                              -.008
Reviews, summarizes main ideas                       .376            .0[?]*
Asks lower order questions                          -.09[?]
Asks higher order questions                         -.047
Repeats question after student response             -.309
Rephrases question after student response           -.257
Signals and changes the topic simultaneously        [?]171
Uses review-signal-shift pattern                     .122
Changes topic with a low order question             -.301
Changes topic with a high order question            [?].162
Asks pairs of questions                             -.013
Tells students to stop irrelevant behavior          [?]050
Other, including substantive digressions            -.043
Adequacy of content coverage                         .456            .01**
Adequacy of concept label coverage                   .528            [?]**

 *p ≤ .05
**p ≤ .01

TABLE 5. PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTS
FOR HIGH INFERENCE TEACHER PROCESS VARIABLES AND CLASS
RESIDUAL MEAN GAIN SCORES

                                                                   Level of
Teacher Process Variables                         Correlation     Significance

Accuracy of concept definitions                      .326
Accuracy of concept examples                         [?]             .04*
Relevance of behavior to objectives                  .370            .04*
Balance between concrete and abstract
  terminology                                        .381            .04*
Uses complete sentences and correct pronouns         .274
Expresses interest and enthusiasm over
  content of lesson                                  .478            .01**
On-task, low noise behavior                          .279

 *p ≤ .05
**p ≤ .01